Metoda projektovanja namenskih programabilnih hardverskih akceleratora (A Method for Designing Application-Specific Programmable Hardware Accelerators)
Embedded computing systems are most often designed to support the execution of a
number of target applications. To achieve the highest possible efficiency, the use of
specialized processors, Application Specific Instruction Set Processors (ASIPs), is
recommended, on which program instructions are executed in purpose-designed,
independent hardware blocks (accelerators). The main reason for having independent
accelerators is to achieve the maximum speedup of instruction execution. However, this
approach requires designing an integrated (ASIC) circuit for each block, which
significantly increases the total processor area. One method for reducing the total area
is to apply the datapath merging technique to the dataflow graphs of the input
applications. The result is a single programmable hardware accelerator capable of
executing all of the desired instructions. However, this has a negative effect on system
efficiency.
It is often overlooked that, owing to the very limited flexibility of ASIC hardware
accelerators, specialized processors have other drawbacks as well. Namely, if the
processor specification is changed, or simply upgraded, in the final design stages, long
delays and additional redesign costs are unavoidable. This thesis shows that the
requirements for flexibility and efficiency need not be mutually exclusive. It
demonstrates that a limited degree of hardware flexibility can be introduced during the
design process, so that the resulting hardware accelerator can execute not only the
applications defined at the very start of the design, but also other applications,
provided they belong to the same domain. In other words, the thesis presents a method
for designing flexible application-specific hardware accelerators. Experimental
evaluation shows that, in most cases, the resulting accelerators have only up to 2× larger
area or 2× longer delay than accelerators obtained with the datapath merging method,
which moreover provides no additional flexibility at all.
Typically, embedded systems are designed to support a limited set of target
applications. To efficiently execute those applications, they may employ Application
Specific Instruction Set Processors (ASIPs) enriched with carefully designed Instruction
Set Extensions (ISEs) implemented in dedicated hardware blocks. The primary goal
when designing ISEs is efficiency, i.e. the highest possible speedup, which implies
synthesizing all critical computational kernels of the application dataflow graphs as
Application Specific Integrated Circuits (ASICs). Yet, this can lead to high on-chip
area dedicated solely to ISEs. One existing approach to decrease this area by paying
a reasonable price of decreased efficiency is to perform datapath merging on input
dataflow graphs (DFGs) prior to generating the ASIC.
It is often neglected that even higher costs can be accidentally incurred due to the lack
of flexibility of such ISEs. Namely, if late design changes or specification upgrades happen,
significant time-to-market delays and non-recurring costs for redesigning the ISEs
and the corresponding ASIPs become inevitable. This thesis shows that flexibility and
efficiency are not mutually exclusive. It demonstrates that it is possible to introduce a
limited amount of hardware flexibility during the design process, such that the resulting
datapath is in fact reconfigurable and thus can execute not only the applications known
at design time, but also other applications belonging to the same application domain.
In other words, it proposes a methodology for designing domain-specific reconfigurable
arrays out of a limited set of input applications. The experimental results show that
resulting arrays are usually around 2× larger and 2× slower than ISEs synthesized using
datapath merging, which have practically no flexibility beyond the design set of DFGs.
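The area trade-off behind datapath merging can be illustrated with a toy estimate. This is only a sketch under loud assumptions, not the thesis's algorithm: each DFG is reduced to a count of operations per type, the unit and multiplexer areas are invented numbers, and interconnect sharing (which real datapath merging also optimizes) is ignored.

```python
from collections import Counter

# Hypothetical areas in arbitrary units; none of these numbers come from the thesis.
UNIT_AREA = {"add": 2, "mul": 8, "shift": 1}
MUX_AREA = 1  # assumed cost of the multiplexing added to one shared unit


def separate_area(dfgs):
    """Area when every DFG is synthesized as its own dedicated datapath."""
    return sum(UNIT_AREA[op] * n for dfg in dfgs for op, n in dfg.items())


def merged_area(dfgs):
    """Area when functional units of the same type are shared across DFGs."""
    area = 0
    for op in set().union(*dfgs):
        # The merged datapath needs as many units as the most demanding DFG.
        needed = max(dfg[op] for dfg in dfgs)
        for k in range(1, needed + 1):
            users = sum(1 for dfg in dfgs if dfg[op] >= k)
            area += UNIT_AREA[op]
            if users > 1:
                # A unit used by several DFGs needs input multiplexing.
                area += MUX_AREA
    return area


# Two illustrative DFGs, described only by their operation counts.
dfg_a = Counter(add=2, mul=1)
dfg_b = Counter(add=1, mul=1, shift=1)

print(separate_area([dfg_a, dfg_b]))  # 23
print(merged_area([dfg_a, dfg_b]))    # 15
```

The merged estimate is smaller because the multiplier and one adder are shared, while the added multiplexers stand in for the efficiency cost the abstract mentions.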
Causal Models of Electrically Large and Lossy Dielectric Bodies
This paper presents a novel formula for the complex permittivity of lossy dielectrics that is valid over a broad frequency range and ensures a causal impulse response in the time domain. The application of this formula is demonstrated through the analysis of wet soil, where the coefficients of the formula are tuned to match measured data from the literature. Additionally, an analytical expression for the impulse response of the relative permittivity is derived. The influence of the frequency dependence of the complex permittivity on the causality of responses is illustrated through the analysis of 1-D, 2-D, and 3-D electromagnetic systems. Being the most complex, the 3-D system is also used as a test bed for comparing the computational limitations of two commercially available solvers, CST and WIPL-D.
The European Project STRUCTURES: Challenges and Results
The STRUCTURES project, funded by the European Union, started in July 2012 to study problems related to the emerging threat of electromagnetic attacks on critical infrastructures. The project partners worked to list possible threats, identify the main characteristics of the critical infrastructures our way of life depends on, test current protection strategies with different simulation and measurement techniques, and condense the results into guidelines accessible to an audience beyond specialists in the field. Here, we summarize the challenges, the solutions, and the results of almost three years of work.
RDS: FPGA Routing Delay Sensors for Effective Remote Power Analysis Attacks
State-of-the-art sensors for measuring FPGA voltage fluctuations are time-to-digital converters (TDCs). They can detect voltage fluctuations on the order of a few nanoseconds. The key building component of a TDC is a delay line, typically implemented as a chain of fast carry propagation multiplexers. In FPGAs, the fast carry chains are constrained to dedicated logic and routing, and need to be routed strictly vertically. In this work, we present an alternative approach to designing on-chip voltage sensors, in which the FPGA routing resources replace the carry logic. We present three variants of what we name a routing delay sensor (RDS): one vertically constrained, one horizontally constrained, and one free of any constraints. We perform a thorough experimental evaluation on both the Sakura-X side-channel evaluation board and the Alveo U200 datacenter card to evaluate the performance of RDS sensors in the context of a remote power side-channel analysis attack. The results show that our best RDS implementation in most cases outperforms the TDC. On average, to break the full 128-bit key of an AES-128 cryptographic core, an adversary requires 35% fewer side-channel traces when using the RDS than when using the TDC. Besides making the attack more effective, given the absence of placement and routing constraints, the RDS sensor is also easier to deploy.
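How a delay-line sensor turns a supply-voltage change into a digital reading can be sketched with a small behavioural model. Everything numeric here is an assumption made for illustration (the 20 ps stage delay, the 64 stages, the 640 ps sampling window, and the inverse-voltage delay law); it models a generic delay line, not the RDS or TDC circuits evaluated in the paper.

```python
def stage_delay_ps(vdd, nominal_delay_ps=20.0, nominal_vdd=1.0):
    """Assumed first-order model: a stage gets slower as the voltage drops."""
    return nominal_delay_ps * (nominal_vdd / vdd)


def delay_line_sample(vdd, n_stages=64, window_ps=640.0):
    """Return the tap states captured at the end of the sampling window.

    A tap reads 1 if the launched edge has already passed it, so the
    result is a thermometer code.
    """
    reached = min(n_stages, int(window_ps // stage_delay_ps(vdd)))
    return [1] * reached + [0] * (n_stages - reached)


def sensor_reading(vdd):
    """Collapse the thermometer code into a single number."""
    return sum(delay_line_sample(vdd))


for vdd in (0.95, 1.00, 1.05):
    print(vdd, sensor_reading(vdd))
```

Under this model a voltage droop slows every stage, the edge travels through fewer taps within the window, and the reading falls; a remote attacker would record such readings as side-channel traces.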